

Improved Techniques for Training Score-Based Generative Models

Neural Information Processing Systems

Score-based generative models can produce high quality image samples comparable to GANs, without requiring adversarial optimization. However, existing training procedures are limited to images of low resolution (typically below 32 x 32), and can be unstable under some settings. We provide a new theoretical analysis of learning and sampling from score models in high dimensional spaces, explaining existing failure modes and motivating new solutions that generalize across datasets. To enhance stability, we also propose to maintain an exponential moving average of model weights. With these improvements, we can effortlessly scale score-based generative models to images with unprecedented resolutions ranging from 64 x 64 to 256 x 256. Our score-based models can generate high-fidelity samples that rival best-in-class GANs on various image datasets, including CelebA, FFHQ, and multiple LSUN categories.
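The abstract's stability fix, keeping an exponential moving average (EMA) of model weights and sampling with the averaged copy, can be sketched in a few lines. This is an illustrative sketch only, using plain Python floats in place of framework tensors; the function name `ema_update` and the decay value are assumptions, not the paper's code.

```python
# Sketch of an exponential moving average over model parameters,
# the stabilization technique described in the abstract.
# Parameters are plain floats keyed by name here; real frameworks
# use tensors, but the update rule is identical.

def ema_update(ema_params, params, decay=0.999):
    """Blend the current parameters into the running average in place."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value

# Usage: copy the initial weights, update the copy after every
# optimizer step, and sample with the EMA weights rather than the
# raw (noisier) training weights.
params = {"w": 1.0}
ema = dict(params)            # start the EMA at the initial weights
params["w"] = 0.0             # pretend one optimizer step changed w
ema_update(ema, params, decay=0.9)
print(ema["w"])               # 0.9 * 1.0 + 0.1 * 0.0 = 0.9
```

A high decay (e.g. 0.999) means the averaged weights change slowly, smoothing out the step-to-step fluctuations of stochastic training.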


Review for NeurIPS paper: Improved Techniques for Training Score-Based Generative Models

Neural Information Processing Systems

Weaknesses: The experimental section does not quite have the experiments I was hoping for. I was hoping to understand *which* techniques were important for scaling the model to higher-resolution images. I know that all techniques together suffice for the model to learn LSUN images, but my suspicion is that only a subset of them (perhaps only EMA) is needed for the regular NCSN to learn high-resolution images. A better understanding of which techniques matter would help explain what is needed for scaling the model. Moreover, the ablation experiments provided suggest that not all 5 techniques are needed for all datasets: Fig 5 seems to show that the simplified network is not needed on CelebA 64x64, since FID is better with the original network.


Review for NeurIPS paper: Improved Techniques for Training Score-Based Generative Models

Neural Information Processing Systems

Some minor concerns were raised about some missing details and the way the results are presented, but the reviewers agree that the author feedback addresses most of these concerns satisfactorily. R2 points out that the sensitivity of FID to noise has been observed previously in the literature. I recommend that the authors take this into account when they update the manuscript.

